Evaluating responses by ChatGPT to farmers’ questions on irrigated lowland rice cultivation in Nigeria

The limited number of agricultural extension agents (EAs) in sub-Saharan Africa limits farmers’ access to extension services. Artificial intelligence (AI) assistants could potentially aid in providing answers to farmers’ questions. The objective of this study was to evaluate the ability of an AI chatbot assistant (ChatGPT) to provide quality responses to farmers’ questions. We compiled a list of 32 questions related to irrigated rice cultivation from farmers in Kano State, Nigeria. Six EAs from the state were randomly selected to answer these questions. Their answers, along with those of ChatGPT, were assessed by four evaluators in terms of quality and local relevancy. Overall, chatbot responses were rated significantly higher quality than EAs’ responses. Chatbot responses received the best score nearly six times as often as the EAs’ (40% vs. 7%). The evaluators preferred chatbot responses to EAs in 78% of cases. The topics for which the chatbot responses received poorer scores than those by EAs included planting time, seed rate, and fertilizer application rate and timing. In conclusion, while the chatbot could offer an alternative source for providing agricultural advisory services to farmers, incorporating site-specific input rate-and-timing agronomic practices into AI assistants is critical for their direct use by farmers.

In sub-Saharan Africa (SSA), rice productivity is often low due to sub-optimal crop management practices by smallholder farmers [1][2][3]. Farmers have limited access to agricultural extension services because of the small number of extension agents (EAs), which leaves many rice farmers without access to updated advice for rice production [4,5]. Furthermore, within rural socio-cultural systems, EAs often do not effectively reach women farmers. In some areas in SSA, women are negatively affected by socio-cultural and religious constraints, which forbid them from communicating freely with men outside their families [5]. A wide variety of technology dissemination and scaling tools (rural radio, videos, etc.) have been developed and used to reach women farmers [6,7]. A dissemination approach in which women service providers reach women farmers has also been proposed for providing field-specific recommendations to farmers, which requires service providers to have digital technologies (smartphone, tablet) [5]. While further efforts are needed to improve access to electricity and the internet to aid the adoption of digital extension services in rural agrarian communities in SSA, recently developed artificial intelligence (AI) assistants are an unexplored resource for addressing the challenges farmers face. One such platform, ChatGPT, represents a new generation of AI technologies driven by advances in large language models [8]. A recent study on health care reported that, although the system was not developed to provide health care, the chatbot responses were preferred over physician responses and rated significantly higher for both quality and empathy [9]. However, its ability to help address farmers' questions on rice cultivation in SSA is unexplored.
Therefore, the objective of this study was to evaluate the ability of an AI chatbot assistant (ChatGPT) to provide quality responses to farmers' questions on rice production. We tested ChatGPT's ability to respond with high-quality answers to farmers' questions by comparing the chatbot responses with EAs' responses to questions in Kano State, one of the major rice-producing areas in northern Nigeria [10,11].

Results
Table 1 lists the questions related to rice production, compiled from interviews with 107 farmers about what they would like to ask EAs to improve their rice production. The most frequent questions concerned types of inputs (variety, fertilizer, herbicide). In terms of the number of questions in each intervention area, crop establishment, insect and disease management, and weed management had the most (5, 5, and   (10 [2-45]) (Fig. 1).
Averaged over the 32 questions, evaluators rated chatbot responses significantly higher in quality than responses by EAs without and with extension materials, by 19% and 15%, respectively (P < 0.01) (Table 3). The mean rating for chatbot responses corresponded to a "good" response (3.8), whereas those for EAs' responses without and with extension materials corresponded to an "acceptable" response (3.2 and 3.3, respectively). There was no significant difference in scores between EAs' responses without and with extension materials. The Pearson correlation coefficient between scores of responses by EAs without and with extension materials was positive and significant (r = 0.71, P < 0.01). The correlation coefficients between scores of responses by the chatbot and by EAs without and with extension materials were not significant (r = −0.13, P > 0.05; r = −0.15, P > 0.05).
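The relative differences and the verbal labels quoted above can be reproduced from the rounded mean ratings. The following is a minimal Python sketch of that arithmetic, using only the three means reported in the text. The helper names `relative_gain_pct` and `likert_label` are illustrative, and mapping a mean to the nearest scale point is an assumption rather than a rule stated in the study.

```python
# Likert scale used by the evaluators (1 = very poor ... 5 = very good).
LIKERT = {1: "very poor", 2: "poor", 3: "acceptable", 4: "good", 5: "very good"}


def relative_gain_pct(chatbot_mean, ea_mean):
    """Percent by which the chatbot mean exceeds the EA mean."""
    return 100.0 * (chatbot_mean - ea_mean) / ea_mean


def likert_label(mean_score):
    """Assumption: a mean rating is labelled by its nearest scale point."""
    nearest = min(LIKERT, key=lambda point: abs(point - mean_score))
    return LIKERT[nearest]


# Mean ratings reported in the text (Table 3).
chatbot, ea_without, ea_with = 3.8, 3.2, 3.3

print(round(relative_gain_pct(chatbot, ea_without)))  # 19
print(round(relative_gain_pct(chatbot, ea_with)))     # 15
print(likert_label(chatbot), likert_label(ea_without), likert_label(ea_with))
```

Because the inputs are means rounded to one decimal, the percentages are only approximate reconstructions of the reported values.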
The proportion of responses rated very good quality (5; range between 1 and 5) was significantly higher (P < 0.05) for chatbot responses than for those of EAs without and with extension materials (Table 4). The chatbot achieved the best score nearly six times as often as EAs (40% vs. 6% and 8%). In contrast, the proportion of responses rated acceptable was significantly lower for the chatbot than for EAs without and with extension materials (Table 4). There was no significant difference in the number of responses rated poor and very poor between the chatbot and EAs without and with extension materials (Table 4). Across the 32 questions, the evaluators preferred the chatbot response over the responses by EAs without and with extension materials in 78% and 69% of cases, respectively (Fig. 2). When we looked at the responses for which the chatbot had lower scores than those authored by EAs (questions 11, 12, and 23 in Tables 1 and 3) and those with scores lower than 3 (questions 14 and 16), we found that the chatbot provided inaccurate information (Table 5); that is, the chatbot-recommended seed rate was too high (11); planting time was not correct for the dry season (12); financial services were not available (14); soil testing was not recommended (16); and the recommended number of seedlings per hill differed between the two seasons when it should not (23).
Table 1. List of questions used for this study, and the target area in terms of agronomic practice. Questions are ordered by the number of farmers giving the same or similar questions (most to fewest). The total number of farmers is higher than the 107 farmers interviewed, as farmers gave up to five questions.
After reviewing the chatbot responses, five of the six EAs who had answered the 32 questions indicated that the chatbot provided relevant answers on rice cultivation and could be used as a tool for EAs to provide farmers with advice (Table 6). All EAs rated the chatbot responses better than their own answers to the questions, and were willing to use the chatbot in the future to get the information required to assist farmers.

Discussion
While chatbot responses were much longer than EAs' responses, the evaluators preferred chatbot-generated responses over those by EAs even when the latter had extension materials. In fact, having extension materials did not significantly improve quality scores, and the scores of responses by EAs with and without extension materials were highly correlated. The chatbot is programmed to provide detailed and comprehensive responses, whereas EAs may provide more concise and practical advice based on their experience. Although the evaluators valued the detailed and comprehensive information provided by the chatbot, farmers might hold different opinions. Longer answers by the chatbot could overwhelm farmers with too much information. Further evaluation by farmers is needed if the chatbot is to be used directly by farmers.
This result is also consistent with a recent study on health care [9], which reported that chatbot responses were preferred over physician responses and rated significantly higher for both quality and empathy. The results from this study suggest that a chatbot might become a useful source of information for advising farmers who have limited access to EAs. However, there was no relationship between the scores of the responses by the chatbot and those by EAs, and the chatbot provided inaccurate information related to planting time, seed rate, and fertilizer application rate
Table 4. Distribution (%) of evaluators' scores on responses by extension agents (EAs) with and without extension materials and the chatbot to 32 questions. Evaluators judged "the quality of information provided" with scores of very poor, poor, acceptable, good, or very good. Within a column, different letters indicate statistically significant differences (P ≤ 0.05). As the scores recorded vary across evaluators, we averaged the number of responses for each score. The percentage can differ depending on the average; for example, if the average number of responses with a score of 3 is 7.75 and 8.25, the percentage is 24 and 26, respectively.
*** statistical significance at the 0.1% (P < 0.001) level; ns, not significant. Values in brackets denote percentages.
information on location and rice production system and protected farmers' identities. Table 1 shows the list of 32 questions used in this study, which covered a wide range of agronomic interventions including seed, variety, land preparation, crop establishment method, and management of nutrients, water, weeds, and insects and diseases. On August 10, 2023, the full text of the questions (Table 1) was entered into a fresh chatbot session [8] free of prior questions that could bias the results, and the chatbot response was saved in a Word file.
Six EAs were nominated from an agricultural extension office in Kano based on their expertise and knowledge of rice cultivation practices. To protect EAs' identities, we do not specify the names of the organizations in this paper. Three of the agents were women. None of them had used a chatbot for their extension services before. They were divided into two groups. One group (three agents) used extension materials for answering questions, while the other group did not. They wrote answers to the questions on paper in their offices under the supervision of enumerators. The number of words in the responses by EAs with and without extension materials and by the chatbot was counted. After the EAs completed their responses, they reviewed the chatbot responses and were then asked about its potential use.
After all responses from the six EAs and the chatbot were compiled, the order of the seven answers was randomized for each question, so that the order could differ from one question to another. We then labeled the answers 1 to 7 within each question to blind the evaluators to the identity of the responders. We removed information that evaluators could use to identify the respondents (for the chatbot, we removed statements such as "I'm an artificial intelligence"). All the responses were evaluated by four local rice experts with good knowledge of local rice production, two from research organizations and two from public extension agencies. The evaluators were asked to judge the quality of the responses in terms of local relevance using a Likert scale (1, very poor; 2, poor; 3, acceptable; 4, good; and 5, very good).
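The per-question randomization and labeling described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure actually used in the study: the function name `blind_responses`, the source names, the placeholder answer texts, and the fixed seed are all hypothetical.

```python
import random


def blind_responses(responses_by_source, rng):
    """Shuffle one question's answers and label them 1..n.

    Returns (blinded, key): `blinded` maps label -> answer text for the
    evaluators, and `key` maps label -> source, kept by the study team only.
    """
    sources = list(responses_by_source)
    rng.shuffle(sources)  # an independent order is drawn for every question
    blinded, key = {}, {}
    for label, source in enumerate(sources, start=1):
        blinded[label] = responses_by_source[source]
        key[label] = source
    return blinded, key


# Hypothetical placeholder data: six EAs plus the chatbot, two questions.
sources = ["EA1", "EA2", "EA3", "EA4", "EA5", "EA6", "chatbot"]
questions = {
    q: {s: f"answer of {s} to question {q}" for s in sources} for q in (1, 2)
}

rng = random.Random(2023)  # fixed seed only to make the sketch reproducible
for q, answers in questions.items():
    blinded, key = blind_responses(answers, rng)
    print(q, key)
```

Keeping the label-to-source key separate from the blinded answers is what allows scores to be matched back to respondents after evaluation.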
Scores were averaged across evaluators for each question. This method is used when there is no ground truth for the outcome being studied and the evaluated outcomes themselves are inherently subjective. The mean score thus reflects evaluator consensus, and disagreement (or inherent ambiguity and uncertainty) between evaluators is reflected in the score variance. Analysis of variance (ANOVA) was conducted to assess differences in quality scores among the responses of EAs with extension materials, EAs without extension materials, and ChatGPT. The chi-squared test was applied to identify significant differences between evaluators' scores on responses by EAs with and without extension materials and by the chatbot. For the chi-squared test, the null hypothesis states that there is no significant difference between the evaluators' scores, whereas the alternative hypothesis states that these scores differ. We employed a t-test to compare the number of words in EAs' and chatbot responses, because the number of words in responses by EAs with and without extension materials was similar. Shapiro-Wilk and Bartlett tests were used before the ANOVA and t-tests to check that the data were normally distributed and homogeneous in variance. Mean separation was done using the Tukey HSD approach. The Pearson correlation between the scores of the responses of EAs and the chatbot was computed. All statistical analyses were performed in R statistical software, version 4.3.1 [14].
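To make the analysis steps concrete, the sketch below averages scores across evaluators, computes a one-way ANOVA F statistic, and computes a Pearson correlation using only the Python standard library. It is an illustration, not the study's code (the analyses were run in R); all score values are hypothetical placeholders rather than study data, the function names are illustrative, and p-values, the Tukey HSD step, and the assumption checks are omitted.

```python
from statistics import mean


def average_across_evaluators(scores_by_evaluator):
    """Each inner list holds one evaluator's scores, one per question."""
    return [mean(per_question) for per_question in zip(*scores_by_evaluator)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def one_way_anova_f(groups):
    """F = between-group mean square / within-group mean square."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical placeholder scores: 4 evaluators x 4 questions for the chatbot.
chatbot_raw = [[4, 5, 3, 4], [4, 4, 3, 5], [5, 4, 2, 4], [3, 5, 4, 4]]
chatbot = average_across_evaluators(chatbot_raw)
# Hypothetical per-question mean scores for the two EA groups.
ea_without = [3.0, 3.5, 3.25, 2.75]
ea_with = [3.25, 3.5, 3.0, 3.0]

print("mean chatbot scores:", chatbot)
print("ANOVA F:", one_way_anova_f([chatbot, ea_without, ea_with]))
print("Pearson r (EA without vs. with):", pearson_r(ea_without, ea_with))
```

In practice a statistics package would be used to obtain p-values as well; for example, SciPy provides `scipy.stats.f_oneway` and `scipy.stats.pearsonr`.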
The distribution of the expert assessment of the responses is presented in Fig. 2. We report the percentage of questions for which the chatbot response was preferred and identified the questions in which the chatbot responses had lower scores than those of EAs.
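The preference percentage and the list of questions on which the chatbot scored lower can be derived from per-question mean scores, as in the following minimal Python sketch. It assumes that "preferred" means a strictly higher mean score and that ties count for neither side; the study does not spell out this rule. The function names and the example scores are hypothetical.

```python
def preference_rate(chatbot_scores, ea_scores):
    """Percent of questions where the chatbot's mean score is strictly higher."""
    preferred = sum(c > e for c, e in zip(chatbot_scores, ea_scores))
    return 100.0 * preferred / len(chatbot_scores)


def questions_where_chatbot_lower(chatbot_scores, ea_scores):
    """1-based question numbers where the chatbot scored below the EAs."""
    pairs = zip(chatbot_scores, ea_scores)
    return [i for i, (c, e) in enumerate(pairs, start=1) if c < e]


# Hypothetical per-question mean scores for four questions.
chatbot_scores = [4.0, 3.0, 5.0, 2.0]
ea_scores = [3.0, 3.0, 4.0, 4.0]

print(preference_rate(chatbot_scores, ea_scores))                # 50.0
print(questions_where_chatbot_lower(chatbot_scores, ea_scores))  # [4]
```

As a point of scale, 25 of 32 questions rounds to 78% and 22 of 32 rounds to 69%.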

Figure 1. Number of words per response authored by extension agents (EAs) and chatbot. As there was no difference in the number of words per response by EAs without and with extension materials, data from both were combined. Different letters indicate a significant difference (P < 0.001).

Table 2. Example of extension agents' and chatbot responses to questions related to rice production in Kano State, Nigeria.
Nigeria, the recommended rice variety for irrigated rice cultivation may vary based on several factors such as local agroecological conditions, market demand, and farmers' preferences. However, as of my knowledge cutoff in September 2021, some of the rice varieties commonly recommended for irrigated rice cultivation in Nigeria include:
Faro 44: This is a popular variety developed by the National Cereal Research Institute (NCRI). Faro 44 is known for its high yield potential, good grain quality, and resistance to diseases such as rice blast
Faro 52: Another variety developed by NCRI, Faro 52 is suitable for irrigated conditions. It is known for its good yield, grain quality, and resistance to diseases
For irrigated rice cultivation in Kano State, Nigeria, the recommended types of fertilizers can vary depending on the specific soil conditions, nutrient deficiencies, and the growth stage of the rice crop. However, some commonly used fertilizers for rice cultivation include:
NPK fertilizers: NPK stands for nitrogen (N), phosphorus (P), and potassium (K), which are essential macronutrients for rice plants. Balanced NPK fertilizers with a ratio appropriate for rice, such as 15-15-15 or 20-10-10, are commonly used. These fertilizers help provide the necessary nutrients for healthy plant growth, root development, and grain formation
Urea: Urea is a common nitrogen fertilizer used in rice cultivation. Nitrogen is particularly important for rice as it is a major component of chlorophyll, which is essential for photosynthesis. Urea is usually applied in split doses during different growth stages of the rice crop
Single Superphosphate (SSP) or Triple Superphosphate (TSP): Superphosphate fertilizers are good sources of phosphorus. Phosphorus plays a crucial role in root development, flowering, and seed formation. SSP or TSP fertilizers can be used to address phosphorus deficiencies in the soil
Muriate of Potash (MOP): Potassium is essential for overall plant health, water regulation, disease resistance, and grain quality. Muriate of Potash is a common potassium fertilizer used in rice cultivation to ensure an adequate supply of potassium
It is important to note that the fertilizer requirements may vary depending on soil nutrient levels, crop rotation practices, and other factors. Conducting a soil test prior to planting can provide valuable information about the specific nutrient needs of your rice crop. It is recommended to consult with local agricultural extension services, soil experts, or agronomists for precise fertilizer recommendations based on the specific conditions in Kano State and the latest agricultural practices

Table 1 (continued).
Target area | Question | Number of farmers
 | What is the best rice establishment method for irrigated lowland rice in Kano State, Nigeria? | 6
Nutrient management | Please provide with recommended fertilizer application practices for irrigated lowland rice cultivation in Kano State, Nigeria | 4
Nutrient management | Are there any organic inputs available for irrigated lowland rice cultivation in Kano State, Nigeria? | 4
Bird control | What are recommended control measures for birds in irrigated lowland rice field in Kano State, Nigeria? | 4
Crop establishment | What is recommended timing for transplanting rice seedlings from nursery for irrigated lowland rice in Kano State, Nigeria? | 3
Abiotic stress | How to alleviate flooding problem in lowland rice fields in Kano State, Nigeria? | 3
General agronomy | What are ways of reducing the cost of irrigated lowland rice production in Kano State, Nigeria? | 3
Weed management | How to identify dosage for herbicide application for irrigated lowland rice cultivation in Kano State, Nigeria? | 3
Crop establishment | What is recommended number of seedlings to transplant rice per hill for irrigated lowland rice in Kano State, Nigeria? | 2
Variety | Are there weed competitive varieties suitable to irrigated lowland rice production in Kano State, Nigeria? | 2
Variety | Are there early maturing rice varieties suitable to irrigated lowland rice production in Kano State, Nigeria? | 2
Water management | Please describe the optimum water depth for irrigated lowland rice cultivation in Kano State, Nigeria | 2
Insect & disease management | How to identify dosage for pesticide application for irrigated lowland rice cultivation in Kano State, Nigeria? | 2
Weed management | When is recommended timing of herbicide application for irrigated lowland rice cultivation in Kano State, Nigeria? | 2
Insect & disease management | How to control rust in rice plants in Kano State, Nigeria? | 2
Insect & disease management | When is recommended timing of pesticide application for irrigated lowland rice cultivation in Kano State, Nigeria? | 1
Abiotic stress | How to manage salinity in irrigated lowland rice cultivation in Kano State, Nigeria? | 1
Land preparation | What are recommended land preparation practices for irrigated lowland rice production in Kano State, Nigeria? | 1

Table 3. Mean scores of responses by extension agents (EAs) with and without extension materials and chatbot to 32 questions. Within a row, different letters indicate statistically significant differences (P ≤ 0.05).

Table 6. Responses of the six extension agents after reviewing the chatbot responses.